Fix Universe CI/CD pipeline test and deploy handoff #18

Open
businesscurry123 wants to merge 4 commits into BAWES-Universe:universe from businesscurry123:bounty/10-cicd-universe-pipeline

Conversation

businesscurry123 commented May 15, 2026

Issue

Addresses #10
/claim #1

Summary

This PR tightens the Universe CI/CD path without adding unconfigured production SSH secrets.

  • test-universe-images.yml now checks out the exact commit that triggered the build workflow, so the production-image tests use compose files from the same revision as the images being tested.
  • Manual Universe image tests default to the universe tag instead of latest, because latest is only emitted when universe is the default branch.
  • build-universe-images.yml now also runs when Universe compose overrides or Universe helper scripts change.
  • A single deploy-universe job runs after all image-test shards pass and calls UNIVERSE_DEPLOY_WEBHOOK when configured.
  • docs/cicd.md documents the PR validation, image build, image test, and production webhook deployment contract for universe.bawes.net.
  • docs/dev-workflow.md documents the day-to-day Universe branch flow, local compose smoke test, deploy status checks, and host-side rollback path.
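The checkout alignment and tag default described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual diff: the build workflow's display name, the step layout, the input description, and the REQUESTED_TAG/DOCKER_TAG variable names are assumptions, while github.event.workflow_run.head_sha and the docker_tag input come from the description.

```yaml
# Fragment of a test workflow (sketch under the assumptions stated above).
on:
  workflow_run:
    workflows: ["Build Universe images"]  # hypothetical name of the build workflow
    types: [completed]
    branches: [universe]
  workflow_dispatch:
    inputs:
      docker_tag:
        description: "Image tag to test"
        type: string
        default: universe

jobs:
  test-universe-images:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # On workflow_run this is the commit that produced the images.
          # On manual runs the expression is empty, so checkout falls back
          # to its default ref.
          ref: ${{ github.event.workflow_run.head_sha }}
      - name: Resolve image tag
        env:
          REQUESTED_TAG: ${{ inputs.docker_tag }}
        # Falls back to "universe" when no tag was supplied (e.g. workflow_run).
        run: echo "DOCKER_TAG=${REQUESTED_TAG:-universe}" >> "$GITHUB_ENV"
```

Passing the input through env rather than interpolating it directly into the script keeps a manually supplied tag from being evaluated as shell code.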

Acceptance criteria

  • Read build-universe-images.yml and documented the image build coverage.
  • Read test-universe-images.yml and fixed the workflow/test revision alignment.
  • Read continuous_integration.yml and documented the PR validation coverage.
  • Documented the Universe CI/CD flow in docs/cicd.md.
  • Added docs/dev-workflow.md for the daily PR-to-merge-to-deploy loop.
  • Documented rollback to immutable universe-<sha> image tags without inventing unconfigured SSH secrets.
  • Identified the missing direct production secret surface and added a safe webhook-based deployment handoff.
  • Kept the change scoped to the paths owned by #10 (CI/CD: Universe deployment pipeline, build → test → deploy to universe.bawes.net): .github/workflows/build-*, Universe test/deploy workflow behavior, cd/ documentation, and CI/CD docs.
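The rollback criterion relies on immutable universe-<sha> tags. One way a host-side pin could look is sketched below as a Compose override. Everything except the image names and the universe-<sha> tag convention is hypothetical: the file name, the service names, the ghcr.io/example-org registry path, and the UNIVERSE_IMAGE_TAG variable are placeholders, not values taken from the repository.

```yaml
# docker-compose.rollback.yaml (hypothetical override file on the production host)
# Assumption: UNIVERSE_IMAGE_TAG is exported as a known-good immutable tag
# of the form universe-<sha> before the stack is brought up again.
services:
  play:
    # ${VAR:?message} makes Compose abort if the variable is unset or empty,
    # so a rollback cannot silently fall back to a moving tag.
    image: ghcr.io/example-org/play-universe:${UNIVERSE_IMAGE_TAG:?required}
  back:
    image: ghcr.io/example-org/back-universe:${UNIVERSE_IMAGE_TAG:?required}
  map-storage:
    image: ghcr.io/example-org/map-storage-universe:${UNIVERSE_IMAGE_TAG:?required}
  uploader:
    image: ghcr.io/example-org/uploader-universe:${UNIVERSE_IMAGE_TAG:?required}
```

Pinning every service to the same tag keeps the rolled-back stack on a single revision instead of mixing images from different builds.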

Verification

  • git diff --check
  • Parsed .github/workflows/build-universe-images.yml with Python YAML.
  • Parsed .github/workflows/test-universe-images.yml with Python YAML.

coderabbitai Bot commented May 15, 2026

Walkthrough

This PR configures GitHub Actions workflows and documentation for the Universe branch. Build triggers are extended to respond to compose and script changes. The test workflow is constrained to the universe branch and checks out the triggering commit, and a new deployment job posts to a webhook after tests pass. The new CI/CD documentation covers workflow descriptions, the deployment contract, and operational procedures.

Changes

Universe CI/CD Pipeline

| Layer / File(s) | Summary |
|---|---|
| Build workflow trigger updates (.github/workflows/build-universe-images.yml) | The push.paths filter is extended to trigger builds when contrib/docker/docker-compose.universe.yaml or matching scripts/*universe* files change. |
| Test workflow standardization and configuration (.github/workflows/test-universe-images.yml) | The workflow_run trigger is constrained to the universe branch, the workflow_dispatch docker_tag default changes from latest to universe, checkout is aligned to the triggering workflow's head_sha, and the docker tag fallback becomes universe instead of latest. |
| Post-test deployment webhook job (.github/workflows/test-universe-images.yml) | A new deploy-universe job is added that runs after test-universe-images; it POSTs to the UNIVERSE_DEPLOY_WEBHOOK secret when configured, otherwise logs a notice and exits successfully. |
| CI/CD documentation and operational guide (docs/cicd.md) | Documents the universe branch CI/CD flow, workflow trigger rules, image build and test steps, deployment contract including GHCR tag conventions, operational merge checklist, and audit notes on checkout alignment and webhook behavior. |
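For the build trigger row, the extended filter might look roughly like the fragment below. The two new path entries are the ones named in the summary; the branches filter and the elided existing entries are assumptions about the rest of the file.

```yaml
# Fragment of .github/workflows/build-universe-images.yml (sketch only).
on:
  push:
    branches: [universe]  # assumption: builds are limited to the universe branch
    paths:
      # ...existing path entries are not shown here...
      - "contrib/docker/docker-compose.universe.yaml"
      # "*" does not cross "/" in GitHub path filters, so this matches only
      # files directly under scripts/ whose names contain "universe".
      - "scripts/*universe*"
```

If Universe helper scripts ever move into subdirectories, a pattern such as scripts/**/*universe* would be needed to keep them covered.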

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A pipeline builds in the Universe light,
Docker composes and scripts unite,
Tests run sharded, true and fair,
Then the webhook floats through the air! ✨
From build to deploy, all checkboxes right. 🚀

🚥 Pre-merge checks: ✅ 5 passed

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Fix Universe CI/CD pipeline test and deploy handoff' accurately and concisely summarizes the main changes: tightening the test workflow to use correct commit revisions and adding webhook-based deployment coordination between test and deploy jobs. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
docs/cicd.md (1)

83-87: ⚡ Quick win

Clarify why discord-bot-universe and bot-server-universe are excluded from the deployment contract.

This section defines the production contract but omits two images that are explicitly built earlier (lines 53-55). Add a short note that they are intentionally non-production here, or include them if they are part of the Universe runtime set.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/cicd.md` around lines 83 - 87, The docs list production images
(play-universe, back-universe, map-storage-universe, uploader-universe) but
omits discord-bot-universe and bot-server-universe; update the cicd.md section
around the image list to either include those two images if they belong in the
production Universe runtime, or add a concise clarifying sentence stating that
discord-bot-universe and bot-server-universe are intentionally excluded from the
production contract (they are built earlier but are
non-production/runtime-only). Reference the image names (discord-bot-universe,
bot-server-universe, play-universe, back-universe, map-storage-universe,
uploader-universe) so the change is clear.
.github/workflows/test-universe-images.yml (2)

189-193: ⚡ Quick win

Consider gating the deploy on a GitHub Environment and a concurrency group.

Two operational nits for a production handoff job:

  1. environment: production — promotes this from a free-running job to one that can carry environment-scoped secrets (so UNIVERSE_DEPLOY_WEBHOOK doesn't have to be repo-wide), optional required reviewers, and surfaces a deployment URL on the run summary. Especially useful given workflow_dispatch can be invoked from any branch.
  2. concurrency: { group: deploy-universe, cancel-in-progress: false } — serializes overlapping deploys when several universe pushes land close together, so the webhook receiver isn't asked to interleave deployments.
♻️ Sketch
   deploy-universe:
     name: "Deploy Universe"
     needs: test-universe-images
     runs-on: ubuntu-latest
     if: ${{ github.event_name == 'workflow_run' || github.event_name == 'workflow_dispatch' }}
+    environment: production
+    concurrency:
+      group: deploy-universe
+      cancel-in-progress: false
     steps:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test-universe-images.yml around lines 189 - 193, The
deploy-universe job is missing environment and concurrency controls; update the
deploy-universe job definition to add an environment (e.g., environment:
production) so it can use environment-scoped secrets and required reviewers, and
add a concurrency block (e.g., concurrency: { group: "deploy-universe",
cancel-in-progress: false }) to serialize overlapping runs; locate the job named
deploy-universe in the workflow and insert those two fields at the job level.

189-205: ⚡ Quick win

Harden the webhook call: add a timeout (and consider retry/observability).

The deploy handoff is the single externally-visible action in this pipeline and has no timeout. If UNIVERSE_DEPLOY_WEBHOOK is slow, returns no response, or stalls TCP, the runner sits idle up to the default 360-minute job timeout, blocking subsequent deploys from being queued cleanly. Adding --max-time (and ideally a bounded retry with backoff) keeps the handoff predictable.

Also consider --silent --show-error so success runs stay quiet but a failure surfaces the response body with status — --fail-with-body already gives non‑zero on 4xx/5xx, but pairing it with -S keeps stderr useful on actual failures.

♻️ Suggested hardening for the curl call
-          curl --fail-with-body --request POST "$UNIVERSE_DEPLOY_WEBHOOK"
+          curl \
+            --fail-with-body \
+            --silent --show-error \
+            --request POST \
+            --max-time 30 \
+            --retry 3 --retry-delay 5 --retry-connrefused \
+            "$UNIVERSE_DEPLOY_WEBHOOK"

Tune --max-time/--retry* to match your production webhook's expected SLA (and ensure the receiver is idempotent if retries are enabled).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test-universe-images.yml around lines 189 - 205, The
deploy-universe job's curl invocation lacks timeouts and observability; update
the curl command that posts to the UNIVERSE_DEPLOY_WEBHOOK (the line containing
curl --fail-with-body --request POST "$UNIVERSE_DEPLOY_WEBHOOK") to include a
total timeout (e.g. --max-time 30), make failures visible but keep successful
runs quiet (e.g. --silent --show-error / -sS), and add a bounded retry policy
(e.g. --retry 3 --retry-delay 5 --retry-connrefused) so the job won't hang
indefinitely and transient errors are retried in a controlled way.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/test-universe-images.yml:
- Around line 189-205: Add a boolean "deploy" workflow_dispatch input (default
false) so manual runs must opt-in to production deploys, and update the
deploy-universe job's conditional (the if for the deploy-universe job) to only
trigger on workflow_run OR on workflow_dispatch when github.event.inputs.deploy
is true; specifically add the deploy input under workflow_dispatch inputs and
change the existing if expression on the deploy-universe job (the job named
deploy-universe) to check (github.event_name == 'workflow_run') ||
(github.event_name == 'workflow_dispatch' && github.event.inputs.deploy ==
'true').
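The opt-in gate requested in this inline comment could be sketched as below. This is an illustration under assumptions rather than the workflow's actual contents: the input description, the step name, and the notice wording are made up, the existing workflow_run trigger and the test-universe-images job that needs refers to are omitted, and only the job name, the deploy input, the if expression, and the curl call follow the text above.

```yaml
# Fragment of .github/workflows/test-universe-images.yml (sketch only).
on:
  workflow_dispatch:
    inputs:
      deploy:
        description: "Call the production deploy webhook after tests pass"
        type: boolean
        default: false

jobs:
  deploy-universe:
    name: "Deploy Universe"
    needs: test-universe-images  # existing test job, not shown in this fragment
    runs-on: ubuntu-latest
    # github.event.inputs values are strings, hence the comparison with 'true'.
    if: ${{ github.event_name == 'workflow_run' || (github.event_name == 'workflow_dispatch' && github.event.inputs.deploy == 'true') }}
    steps:
      - name: Call deploy webhook
        env:
          UNIVERSE_DEPLOY_WEBHOOK: ${{ secrets.UNIVERSE_DEPLOY_WEBHOOK }}
        run: |
          # An unconfigured secret expands to an empty string; skip quietly.
          if [ -z "$UNIVERSE_DEPLOY_WEBHOOK" ]; then
            echo "::notice::UNIVERSE_DEPLOY_WEBHOOK is not configured; skipping deploy."
            exit 0
          fi
          curl --fail-with-body --request POST "$UNIVERSE_DEPLOY_WEBHOOK"
```

With default: false, a manual run tests images without touching production unless the operator explicitly ticks the deploy input.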


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a78b6a03-4002-4eb4-b199-2d23bb093274

📥 Commits

Reviewing files that changed from the base of the PR and between 1dd3ad1 and 1a40b29.

📒 Files selected for processing (3)
  • .github/workflows/build-universe-images.yml
  • .github/workflows/test-universe-images.yml
  • docs/cicd.md
